L0 Sparse Inverse Covariance Estimation
Recently, there has been focus on penalized log-likelihood covariance
estimation for sparse inverse covariance (precision) matrices. The penalty is
responsible for inducing sparsity, and a very common choice is the convex
$\ell_1$ norm. However, the best estimator performance is not always achieved with this
penalty. The most natural sparsity-promoting "norm" is the non-convex $\ell_0$
penalty, but its lack of convexity has deterred its use in sparse maximum
likelihood estimation. In this paper we consider non-convex penalized
log-likelihood inverse covariance estimation and present a novel cyclic descent
algorithm for its optimization. Convergence to a local minimizer is proved,
which is highly non-trivial, and we demonstrate via simulations the reduced
bias and superior quality of the $\ell_0$ penalty as compared to the $\ell_1$
penalty.
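The bias difference between the two penalties is easiest to see in their scalar thresholding rules, which appear as the elementwise update inside coordinate-descent schemes. A minimal sketch (the threshold value and inputs are illustrative, not the paper's algorithm):

```python
import numpy as np

def soft_threshold(x, t):
    # Thresholding rule associated with the convex l1 penalty:
    # every surviving entry is shrunk toward zero by t (source of bias).
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def hard_threshold(x, t):
    # Thresholding rule in the spirit of the non-convex l0 penalty:
    # entries above the threshold are kept at their full magnitude.
    return np.where(np.abs(x) > t, x, 0.0)

x = np.array([-3.0, -0.5, 0.2, 2.0])
print(soft_threshold(x, 1.0))  # [-2.  0.  0.  1.] -- large entries shrunk
print(hard_threshold(x, 1.0))  # [-3.  0.  0.  2.] -- large entries unbiased
```

Both rules zero out small entries, but only soft thresholding distorts the large ones, which is the reduced-bias effect the abstract refers to.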
Regularized Block Toeplitz Covariance Matrix Estimation via Kronecker Product Expansions
In this work we consider the estimation of spatio-temporal covariance
matrices in the low sample non-Gaussian regime. We impose covariance structure
in the form of a sum of Kronecker products decomposition (Tsiligkaridis et al.
2013, Greenewald et al. 2013) with diagonal correction (Greenewald et al.),
which we refer to as DC-KronPCA, in the estimation of multiframe covariance
matrices. This paper extends the approaches of (Tsiligkaridis et al.) in two
directions. First, we modify the diagonally corrected method of (Greenewald et
al.) to include a block Toeplitz constraint imposing temporal stationarity
structure. Second, we improve the conditioning of the estimate in the very low
sample regime by using Ledoit-Wolf type shrinkage regularization similar to
(Chen, Hero et al. 2010). For improved robustness to heavy tailed
distributions, we modify the KronPCA to incorporate robust shrinkage estimation
(Chen, Hero et al. 2011). Results of numerical simulations establish benefits
in terms of estimation MSE when compared to previous methods. Finally, we apply
our methods to a real-world network spatio-temporal anomaly detection problem
and achieve superior results. Comment: To appear at IEEE SSP 2014, 4 pages
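The backbone of a Kronecker product expansion can be sketched with the classical rearrangement-plus-SVD construction; the function names below are mine, and the sketch omits the diagonal correction, block Toeplitz constraint, and shrinkage steps that the abstract adds on top:

```python
import numpy as np

def rearrange(A, p, q):
    # Rearrangement operator: maps a (p*q x p*q) matrix so that
    # B (kron) C becomes the rank-1 outer product vec(B) vec(C)^T.
    R = np.empty((p * p, q * q))
    for i in range(p):
        for j in range(p):
            R[i * p + j] = A[i * q:(i + 1) * q, j * q:(j + 1) * q].reshape(-1)
    return R

def kron_expansion(A, p, q, r):
    # Rank-r sum-of-Kronecker-products approximation:
    # A ~= sum_k B_k (kron) C_k, via SVD in the rearranged domain.
    U, s, Vt = np.linalg.svd(rearrange(A, p, q))
    terms = []
    for k in range(r):
        B = (np.sqrt(s[k]) * U[:, k]).reshape(p, p)
        C = (np.sqrt(s[k]) * Vt[k]).reshape(q, q)
        terms.append((B, C))
    return terms
```

If A is exactly one Kronecker product, a single term recovers it; in general the leading terms give the best Frobenius-norm approximation of this form.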
Decomposable Principal Component Analysis
We consider principal component analysis (PCA) in decomposable Gaussian
graphical models. We exploit the prior information in these models in order to
distribute its computation. For this purpose, we reformulate the problem in the
sparse inverse covariance (concentration) domain and solve the global
eigenvalue problem using a sequence of local eigenvalue problems in each of the
cliques of the decomposable graph. We demonstrate the application of our
methodology in the context of decentralized anomaly detection in the Abilene
backbone network. Based on the topology of the network, we propose an
approximate statistical graphical model and distribute the computation of PCA.
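The clique-local flavor of the computation can be sketched with a power iteration whose global matrix-vector product is assembled from per-clique contributions; this assumes the concentration matrix is already given as a sum of zero-padded clique terms, which is a simplification of the paper's decomposition:

```python
import numpy as np

def distributed_power_iteration(clique_terms, n, iters=200):
    """Toy sketch: top eigenvector of K = sum of per-clique terms.
    clique_terms: list of (index_array, local_matrix) pairs, where each
    local matrix only touches the variables in its own clique."""
    v = np.ones(n) / np.sqrt(n)
    for _ in range(iters):
        y = np.zeros(n)
        for idx, Kc in clique_terms:
            y[idx] += Kc @ v[idx]  # local product computed inside one clique
        v = y / np.linalg.norm(y)
    return v
```

Each clique only ever sees its own slice of the iterate, which is the sense in which the eigenvalue computation can be distributed; the result can be checked against a centralized `np.linalg.eigh` on the assembled matrix.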
Scalable Hash-Based Estimation of Divergence Measures
We propose a scalable divergence estimation method based on hashing. Consider
two continuous random variables X and Y whose densities have bounded
support. We consider a particular locality-sensitive random hashing, and
consider the ratio of samples in each hash bin having non-zero numbers of Y
samples. We prove that the weighted average of these ratios over all of the
hash bins converges to f-divergences between the two samples sets. We show that
the proposed estimator is optimal in terms of both MSE rate and computational
complexity. We derive the MSE rates for two families of smooth functions: the
Hölder smoothness class and differentiable functions. In particular, it is
proved that if the density functions have bounded derivatives up to the order
$d$, where $d$ is the dimension of the samples, the optimal parametric MSE rate
of $O(1/N)$ can be achieved. The computational complexity is shown to be
$O(N)$, which is optimal. To the best of our knowledge, this is the first
empirical divergence estimator that has optimal computational complexity and
achieves the optimal parametric MSE estimation rate. Comment: 11 pages, Proceedings of the 21st International Conference on
Artificial Intelligence and Statistics (AISTATS) 2018, Lanzarote, Spain
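A much-simplified one-dimensional sketch of the idea, using uniform-width binning as the hash and KL divergence as the f-divergence (the bin width, sample values, and function names are illustrative, not the paper's estimator):

```python
import numpy as np
from collections import Counter

def hash_bin(x, eps):
    # A simple locality-sensitive "hash": points in the same
    # eps-width cell collide into the same bucket.
    return np.floor(x / eps).astype(int)

def kl_hash_estimate(x, y, eps):
    """Plug-in estimate of D(P||Q) from X-samples x and Y-samples y:
    a weighted average of per-bin count ratios, restricted to bins
    with a non-zero number of Y samples."""
    nx, ny = Counter(hash_bin(x, eps)), Counter(hash_bin(y, eps))
    N, M = len(x), len(y)
    est = 0.0
    for b, cnt in nx.items():
        if ny.get(b, 0) > 0:
            p, q = cnt / N, ny[b] / M
            est += p * np.log(p / q)  # f(t) = t log t for KL divergence
    return est
```

For x = [0.1, 0.1, 0.9], y = [0.1, 0.9, 0.9] and eps = 0.5, the bin histograms are (2/3, 1/3) and (1/3, 2/3), giving an estimate of (log 2)/3.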
Node Removal Vulnerability of the Largest Component of a Network
The connectivity structure of a network can be very sensitive to removal of
certain nodes in the network. In this paper, we study the sensitivity of the
largest component size to node removals. We prove that minimizing the largest
component size is equivalent to solving a matrix one-norm minimization problem
whose column vectors are orthogonal and sparse and form a basis of the
null space of the associated graph Laplacian matrix. A greedy node removal
algorithm is then proposed based on the matrix one-norm minimization. In
comparison with other node centralities such as node degree and betweenness,
experimental results on the US power grid dataset validate the effectiveness of the
proposed approach in terms of reduction of the largest component size with
relatively few node removals. Comment: Published in IEEE GlobalSIP 201
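As a point of comparison (this is a brute-force baseline, not the paper's one-norm formulation), a greedy removal that directly targets the largest component size can be sketched as:

```python
from collections import deque

def largest_component(adj, removed):
    # Size of the largest connected component of the graph `adj`
    # (adjacency dict) after deleting the nodes in `removed`, via BFS.
    seen, best = set(removed), 0
    for s in adj:
        if s in seen:
            continue
        q, size = deque([s]), 0
        seen.add(s)
        while q:
            u = q.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    q.append(v)
        best = max(best, size)
    return best

def greedy_node_removal(adj, k):
    # At each step, remove the node whose deletion most reduces
    # the largest component size.
    removed = set()
    for _ in range(k):
        cand = min((n for n in adj if n not in removed),
                   key=lambda n: largest_component(adj, removed | {n}))
        removed.add(cand)
    return removed
```

On a 5-node path graph 0-1-2-3-4, one greedy removal picks the middle node, cutting the largest component from 5 to 2; the one-norm approach in the abstract reaches such cuts without exhaustively re-evaluating component sizes.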